Text Recognition, Historical Manuscripts, Character Segmentation, Paleographic Processing
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
arxiv.org · 11h
Positional Embeddings in Transformers: A Math Guide to RoPE & ALiBi
towardsdatascience.com · 1h
Research on smartphone image source identification based on PRNU features collected multivariate sampling strategy
sciencedirect.com · 1h
More on GPT-5 pseudo-text in graphics
languagelog.ldc.upenn.edu · 2d
ISALux: Illumination and Segmentation Aware Transformer Employing Mixture of Experts for Low Light Image Enhancement
arxiv.org · 11h
Probabilistic Classification of Near-Surface Shallow-Water Sediments using A Portable Free-Fall Penetrometer
arxiv.org · 11h
Propose and Rectify: A Forensics-Driven MLLM Framework for Image Manipulation Localization
arxiv.org · 11h
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
arxiv.org · 11h
ArgusCogito: Chain-of-Thought for Cross-Modal Synergy and Omnidirectional Reasoning in Camouflaged Object Segmentation
arxiv.org · 11h